Text File | 1991-12-04 | 11KB | 277 lines
ZK (formerly ZipKwik)
release 1.1
KWIC (Key Word in Context) and KWOC (Key Word Out
of Context) indexing system for ASCII text files.
S H A R E W A R E
ZK is shareware. If you use the program and find it useful,
please send $ 10.00 to the author at the address below:
Clyde W. Grotophorst
Route 1, Box 296
Hamilton, VA 22068
USA
Registered users gain extended access to the GMUtant
Software BBS (703) 993-2219 1200/2400/9600 v.32. You
will also be able to obtain assistance from the author
in using (or enhancing) ZK...
Registered users may purchase ZK source code
for $ 15.00. The source code for ZK is written in Turbo Pascal 6.0.
To compile it, you must have two TurboPower Software
products: Object Professional or Turbo Professional,
and (for sort routines) B-Tree Filer (5.22).
(c) 1991, Clyde W. Grotophorst, GMUtant Software
QUICKSTART: To see what ZK can do, type ZK TESTFILE at your DOS prompt.
A KWIC or KWOC index of the contents in TESTFILE will be created.
Output filename: TESTFILE.QWK
------------
INTRODUCTION
------------
ZK is a program which produces KWIC (Key Word In Context) or KWOC (Key Word
Out of Context) indexes from ASCII text files. Here's a sample of what
ZK produces (using a listing of some magazine titles):
Before processing (a text file you create with your wordprocessor):
Fixing your Leading Edge PC Popular Mechanics Nov 79, p. 43-4
Is KWIC better than KWOC? Index methods compared. ASIS Nov 1982 pg. 44-56.
Who says you need a mainframe for KWIC indexing GMUtant News Jan 89 page 2
Are KWIC indexes actually promoted by paper companies? True Stories Nov 88
After processing (a KWIC index, ZK output option #1):
Stories Nov 88 Are KWIC indexes ║ACTUALLY ║promoted by paper companies? True
n KWOC? Index methods compared. ║ASIS ║Nov 1982 pg. 44-56. Is KWIC better tha
Byte Oct 88, page 79 Pascal for ║BEGINNERS ║
SIS Nov 1982 pg. 44-56. Is KWIC ║BETTER ║than KWOC? Index methods compared. A
Interface: Findings of a survey ║BYTE ║Nov 87 pg. 24-32 What is the User-
8, page 79 Pascal for Beginners ║BYTE ║Oct 8
PC Magazine 12:14-17 Oct 1988 ║C ║Compilers: A product review
ove Sanscrit, you're halfway to ║C ║programming. Micro 13:1 pg. 70+ If you know and l
exes actually promoted by paper ║COMPANIES ║True Stories Nov 88 Are KWIC ind
better than KWOC? Index methods ║COMPARED ║ASIS Nov 1982 pg. 44-56. Is KWIC
PC Magazine 12:14-17 Oct 1988 C ║COMPILERS ║A product review
ng for the Perfect Sort Popular ║COMPUTING ║Oct 87, pg. 19-23 Searchi
9, p. 43-4. Fixing your Leading ║EDGE ║PC Popular Mechanics Nov 7
Apr 79, p. 2 Sort Routines for ║EVENING ║Hours 80 Micro
-32 What is the User-Interface: ║FINDINGS ║of a survey BYTE Nov 87 pg. 24
lar Mechanics Nov 79, p. 43-4. ║FIXING ║your Leading Edge PC Popu
d a mainframe for KWIC indexing ║GMUTANT ║News Jan 89 page 2 Who says you nee
know and love Sanscrit, you're ║HALFWAY ║to C programming. Micro 13:1 pg. 70+ If you
rogramming. Micro 13:1 pg. 70+ ║IF ║you know and love Sanscrit, you're halfway to C p
4-56. Is KWIC better than KWOC? ║INDEX ║methods compared. ASIS Nov 1982 pg. 4
s? True Stories Nov 88 Are KWIC ║INDEXES ║actually promoted by paper companie
s you need a mainframe for KWIC ║INDEXING ║GMUtant News Jan 89 page 2 Who say
ming. Micro 13:1 pg. 70+ If you ║KNOW ║and love Sanscrit, you're halfway to C program
ed. ASIS Nov 1982 pg. 44-56. Is ║KWIC ║better than KWOC? Index methods compar
panies? True Stories Nov 88 Are ║KWIC ║indexes actually promoted by paper com
o says you need a mainframe for ║KWIC ║indexing GMUtant News Jan 89 page 2 Wh
pg. 44-56. Is KWIC better than ║KWOC ║Index methods compared. ASIS Nov 1982
K W O C
Below is an excerpt of a KWOC index (output option #3):
║ACTUALLY ║ Are KWIC indexes ACTUALLY promoted by paper companies? True Stories Nov 88
║ASIS ║ Is KWIC better than KWOC? Index methods compared. ASIS Nov 1982 pg. 44-56.
║BETTER ║ Is KWIC BETTER than KWOC? Index methods compared. ASIS Nov 1982 pg. 44-56.
║COMPANIES ║ Are KWIC indexes actually promoted by paper COMPANIES? True Stories Nov 88
║COMPARED ║ Is KWIC better than KWOC? Index methods COMPARED. ASIS Nov 1982 pg. 44-56.
║EDGE ║ Fixing your Leading EDGE PC Popular Mechanics Nov 79, p. 43-4
║FIXING ║ FIXING your Leading Edge PC Popular Mechanics Nov 79, p. 43-4
║GMUTANT ║ Who says you need a mainframe for KWIC indexing GMUTANT News Jan 89 page 2
║INDEX ║ Is KWIC better than KWOC? INDEX methods compared. ASIS Nov 1982 pg. 44-56.
║INDEXES ║ Are KWIC INDEXES actually promoted by paper companies? True Stories Nov 88
What happens is pretty simple--although while developing ZK I began to wonder
if it was impossible. All the lines in your original file are rotated,
with a unique word highlighted each time. Actually, not every word--you may
select up to 400 words (stopwords) which will not appear as main headings
in your final KWIC or KWOC listing.
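The rotation described above can be sketched in a few lines of Python. This
is only an illustration, not ZK's actual Turbo Pascal code: the function name
kwic_index, the punctuation set, the 35-character context width and the use
of '|' in place of the double vertical bar are all assumptions made for the
example.

```python
# Hypothetical sketch of KWIC rotation; names and defaults are assumptions.
PUNCT = ".,;:?!()\"'"

def kwic_index(lines, stopwords=(), width=35):
    """Return one KWIC line per non-stop word, sorted by keyword."""
    stop = {w.upper() for w in stopwords}
    entries = []
    for line in lines:
        words = line.split()
        for i, word in enumerate(words):
            key = word.strip(PUNCT).upper()
            if not key or key in stop:
                continue  # stopwords never become main headings
            # Rotate the line so it starts just after the keyword and
            # wraps around to the words that came before it.
            ring = " ".join(words[i + 1:] + words[:i])
            left = ring[-width:].rjust(width)   # tail of the rotation
            right = ring[:width]                # head of the rotation
            entries.append((key, left, right))
    entries.sort(key=lambda e: e[0])
    return [f"{left} |{key}| {right}" for key, left, right in entries]

if __name__ == "__main__":
    sample = ["Is KWIC better than KWOC? Index methods compared."]
    for row in kwic_index(sample, stopwords=["IS", "THAN"]):
        print(row)
```

Each input line yields one output line per word that is not in the stopword
list, which is why the index is so much longer than the input.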
When finished, simply print out the final file using your word processor,
program editor, or the DOS type command.
As you can see, KWIC indexing creates rather large files. The sample at
the beginning of this introduction grew from 4 to 38 lines. To keep this
growth manageable, use your stopword list often and wisely.
Performance:
ZK produces a KWOC index of a 217-line input file in 1 minute
(running on a DELL 310 (386/20MHz) with disk caching).
Files included:
ZKSTOP.LST - Your list of stopwords. Comments inside this
text file explain its use. Read all comments
before you modify the list.
ZK.DOC - This documentation file
ZK.EXE - The program.
TESTFILE - A sample file used to generate a sample KWIC
or KWOC index. To see a KWIC file, enter
ZKWIC TESTFILE at the DOS prompt. For a KWOC
version, enter ZKWOC TESTFILE.
Step 1. Prepare your input file.
a. ZK processing will add about 15 characters to your lines, so
plan accordingly. We've found that with a printer capable of
putting 120 lines on a page, a line limit of 100 characters in
your input file works well.
b. There is no limit on the length of your input file...although
realize that your KWIC/KWOC indexes will be approximately 10 times
larger than your original file (since each word gets its own
line in the final product).
c. If you're entering book or journal titles, use a standard form
of abbreviation for things like page, date, volume no, and the
like. If you're consistent with these, you'll only need one
entry in your stoplist to purge them as main entries in your
final product. Also, leave a space between things like pg.
and the number (e.g., pg. 7). That way the 7 won't become
a part of the word pg.--and fool ZK. Some people like to end
each entry in a KWIC index with some sort of character denoting the
end of the string. For example, I've found that putting '>>' on
the end of each line in the input file improves readability
(since you can see where the line logically begins):
p. 7>> Why Indexers ║GO ║crazy early in Life ASIS v7n2
Routines for Evening ║HOURS ║80 Micro Apr 79, p. 4>> Sort
d. Your input file must be plain ASCII text.
e. There's no minimum limit for line length in your input file.
Lines shorter than 35 characters will be padded as
necessary--to ensure that the highlighted word appears
within the vertical lines on KWIC indexes.
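The input guidelines above can be checked before running ZK. The following
Python sketch is not part of ZK; the function name check_input and its
return format are hypothetical. It flags lines that are not plain ASCII or
that exceed the suggested 100-character limit, and counts words as a rough
upper bound on the number of index lines (before stopwords are removed).

```python
# Hypothetical pre-flight check for a ZK input file; not part of ZK itself.
def check_input(lines, max_len=100):
    """Return (problems, word_count) for an iterable of text lines."""
    problems = []
    word_count = 0
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.isascii():
            problems.append((number, "non-ASCII character"))
        if len(text) > max_len:
            problems.append((number, f"longer than {max_len} characters"))
        # Every word may get its own index line, so this is an upper bound.
        word_count += len(text.split())
    return problems, word_count

if __name__ == "__main__":
    sample = ["Fixing your Leading Edge PC Popular Mechanics Nov 79, p. 43-4\n"]
    print(check_input(sample))
```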
Step 2. Check your stoplist.
ZKSTOP.LST is an ASCII file that contains your stop list; that is,
words that may appear in your final product but that you do
not want highlighted in the MAIN HEADING area.
In practice you may well discover that it is best to customize your
stoplist for a given index--some words will be common to a particular
data file but not so common otherwise. Note that the more words you
add to ZKSTOP.LST, the slower ZK will run--it has more possibilities
to check as it moves through the file...
Here's an excerpt of ZKSTOP.LST as distributed:
-------------------
THE
OF
A
BY
PG
* You may purge numbers listing as MAIN HEADINGS by entering the
* first digit of a number and putting a + sign after it. For
* example, 1+ will purge 1, 10, 199, 1324234234 and so on.
* If you omit the '+' sign, ZKSTOP will only purge the
* single digit, not any digit that begins with that number.
1+
2+
3+
4+
* END OF ZKSTOP.LST
-----------------------
You may put comments in ZKSTOP.LST; just include a '*' in the line
with the comment. ANY line that contains a * will be ignored, so don't
put words with *'s in them in your list--they won't be stopped.
If you should lose or destroy ZKSTOP.LST, you may recreate it.
It must be a plain ASCII file with one word on a line. The
last line in the file must be a comment (to signal ZK that
the end of the file has been reached). You may have up to 400
words in this file, but no word may be longer than 20 characters.
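The stoplist rules above (one word per line, any line containing '*' is a
comment, a trailing '+' stops every number that begins with that digit, at
most 400 words of at most 20 characters) can be sketched in Python as
follows. The function names parse_stoplist and is_stopped are hypothetical,
and the choice to apply the '+' rule only to all-digit words is an
assumption; ZK itself may treat mixed tokens differently.

```python
# Hypothetical reading of the ZKSTOP.LST rules; not ZK's own code.
def parse_stoplist(lines, max_words=400, max_len=20):
    """Return (words, number_prefixes) parsed from stoplist lines."""
    words, number_prefixes = set(), set()
    for raw in lines:
        entry = raw.strip().upper()
        if not entry or "*" in entry:
            continue  # any line containing '*' is a comment
        if len(entry) > max_len:
            raise ValueError(f"stopword longer than {max_len} characters: {entry}")
        if entry.endswith("+") and entry[:-1].isdigit():
            number_prefixes.add(entry[:-1])  # e.g. '1+' stops 1, 10, 199, ...
        else:
            words.add(entry)
    if len(words) + len(number_prefixes) > max_words:
        raise ValueError(f"more than {max_words} stopwords")
    return words, number_prefixes

def is_stopped(word, words, number_prefixes):
    """True if the word should not appear as a main heading."""
    key = word.upper()
    if key in words:
        return True
    # Assumption: the '+' rule applies only to all-digit words.
    return key.isdigit() and any(key.startswith(p) for p in number_prefixes)

if __name__ == "__main__":
    with open("ZKSTOP.LST", encoding="ascii") as fh:
        words, prefixes = parse_stoplist(fh)
    print(is_stopped("199", words, prefixes))
```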
Here's what happens if you omit stopword processing (or leave out
number-stopping in your list):
Popular Mechanics Nov |79, |p. 43-4 Fixing your Leading Edge PC
ies? True Stories Nov |88 |Are KWIC indexes actually promoted
xing GMUtant News Jan |89 |page 2 Who says you need a mainframe
The highlighting of 79, 88, and 89 isn't very informative.
Other notes:
When creating a KWOC index, ZK will trim punctuation off the end of the
keyword but will leave any other punctuation that occurs intact.
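As a rough illustration of that KWOC behavior, the Python sketch below trims
trailing punctuation from the heading only and leaves the line itself
intact, with the keyword shown in capitals. It is not ZK's code; the
function name kwoc_index, the punctuation set and the '|' delimiters are
assumptions.

```python
# Hypothetical KWOC sketch; names and punctuation set are assumptions.
def kwoc_index(lines, stopwords=()):
    """Return one KWOC line per non-stop word, sorted by keyword."""
    stop = {w.upper() for w in stopwords}
    entries = []
    for line in lines:
        words = line.split()
        for i, word in enumerate(words):
            # Only trailing punctuation is trimmed from the heading.
            key = word.rstrip(".,;:?!").upper()
            if not key or key in stop:
                continue
            # The full line is kept, with this occurrence capitalized.
            shown = " ".join(words[:i] + [word.upper()] + words[i + 1:])
            entries.append((key, shown))
    entries.sort()
    return [f"|{key}| {shown}" for key, shown in entries]

if __name__ == "__main__":
    for row in kwoc_index(["Are KWIC indexes promoted?"], stopwords=["ARE"]):
        print(row)
```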
Be sure you have about 12 times the disk space required to store your
input file before beginning ZK.
The fewer stop words you have, the faster ZK will run.
Another GMUtant Software product you might find interesting:
BIBL (rhymes with nibble). A full-featured personal library management
system. BIBL's features include: full mouse support, ASCII import/
export, bibliography production, multiple databases, may be
run in read-only mode if desired, boolean (and/or/not) searching,
support for external editor/viewer programs, data import from
CD-ROM downloads and more. Registered version adds EMS storage
of indexes, use of WordPerfect as editor/printer, global find & replace
on any field, and more.
BIBL matches up well with ProCite and other bibliographic
packages. BIBL is available via Public Software Library, the
GMUtant OnLine BBS, or CompuServe (IBMAPPS forum).
Current release: 5.70
Questions? Comments?
C.W.Grotophorst
Phone: (703) 993-2239
BBS: (703) 993-2219 24 hrs per day, 1200/2400/9600 v.32
Bitnet: WALLYG@GMUVAX
CompuServe: 70404,3376